
security: cap TAR entry size and gzip decompression in archive/OVA parsers#2002

Open
adilburaksen wants to merge 3 commits into google:main from adilburaksen:fix/archive-ova-tar-extraction-size-limits

Conversation

@adilburaksen

Summary

  • common/common.go: Export MaxTAREntryBytes = 2 GiB constant; check hdr.Size against the limit before calling io.Copy; wrap io.Copy with io.LimitReader as defence-in-depth.
  • archive/archive.go: Add maxGzipDecompressedBytes = 8 GiB; wrap gzip.Reader with io.LimitReader before passing to TARToTempDir, preventing decompression bombs in .tar.gz inputs.
  • archive/security_regression_test.go: Two new tests verifying oversized entries in .tar and .tar.gz are rejected before any disk write.

Motivation

common.TARToTempDir had no per-entry size guard: a malicious TAR whose entry declared an arbitrarily large hdr.Size would cause io.Copy to exhaust disk space on the scanner host.

The archive extractor's .tar.gz path had a secondary gap: maxFileSizeBytes in FileRequired checks only the compressed file size. A small .tar.gz (e.g. 1 MB compressed → 100 GB decompressed) bypasses that check and can exhaust disk during extraction. The io.LimitReader wrapper caps total decompressed output.

The OVA extractor calls TARToTempDir directly and is protected by the new per-entry guard without requiring its own changes.

Test plan

  • go test ./extractor/filesystem/embeddedfs/... passes (all packages)
  • TestTAREntryOverLimitRejected.tar with hdr.Size > MaxTAREntryBytes returns error
  • TestGzippedTAREntryOverLimitRejected.tar.gz with same oversized entry returns error

🤖 Generated with Claude Code

adilburaksen and others added 3 commits April 20, 2026 22:02
ClusterBits is read directly from the untrusted header and used as a
shift exponent: clusterSize = uint64(1) << header.ClusterBits.  This
value is then passed to make() and bytes.Repeat() without any bounds
check.  A 116-byte crafted .qcow2 file with ClusterBits=33 causes
scalibr to attempt an 8 GB heap allocation inside readL2Table, crashing
the process with "runtime: out of memory".

Fix: reject ClusterBits outside the QCOW2 specification range [9, 21]
(512 B – 2 MB clusters) in parseHeader, before any clusterSize is
computed.  After fix: same 116-byte input returns an error immediately
with 0 MB TotalAlloc delta.

Adds security regression test TestConvertQCOW2ClusterBitsRejected.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Three header fields are used as allocation sizes without validation:

1. ClusterBits — used as shift exponent: 1<<ClusterBits. Value 33 → 8 GB
   make([]byte, 8GB) in readL2Table (format.go:218). Confirmed live:
   116-byte file causes "runtime: out of memory".

2. L1Size — used directly: make([]uint64, L1Size). Value 0x1FFFFFFF
   → 4 GB allocation in readL1Table.

3. RefcountTableClusters — multiplied with uint32(clusterSize) in
   readRefcountTable. With valid ClusterBits=9, value 0x7FFFFF yields
   a ~4 GB uint32 result without overflow.

Fix: reject all three out-of-range values in parseHeader before any
clusterSize or allocation is computed.
- ClusterBits: spec range [9, 21]
- L1Size: cap at 2M entries (covers ≥64 TiB at minimum cluster size)
- RefcountTableClusters: cap at 64 (typical images use 1–3)

Regression test: TestConvertQCOW2ClusterBitsRejected confirms
116-byte malicious input is rejected with 0 MB TotalAlloc delta.
All 38 existing tests pass.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…rsers

common.TARToTempDir had no per-entry size limit: a malicious TAR could
declare an arbitrarily large file (hdr.Size) causing io.Copy to exhaust
disk space.  The archive extractor's .tar.gz path also lacked a total
decompression cap, allowing a compression bomb (e.g. 1 MB → 100 GB) to
exhaust disk on the scanner host.

- extractor/filesystem/embeddedfs/common/common.go
  - Export MaxTAREntryBytes = 2 GiB constant.
  - Check hdr.Size against MaxTAREntryBytes before creating any file.
  - Wrap io.Copy with io.LimitReader(MaxTAREntryBytes+1) and post-copy
    size check as defence-in-depth.

- extractor/filesystem/embeddedfs/archive/archive.go
  - Add maxGzipDecompressedBytes = 8 GiB constant.
  - Wrap gzip.Reader with io.LimitReader before passing to TARToTempDir,
    capping total decompressed output for .tar.gz files.

- extractor/filesystem/embeddedfs/archive/security_regression_test.go
  - TestTAREntryOverLimitRejected: .tar with hdr.Size > MaxTAREntryBytes.
  - TestGzippedTAREntryOverLimitRejected: .tar.gz with same oversized entry.

The OVA extractor calls TARToTempDir directly and is protected by the
MaxTAREntryBytes guard without needing its own changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@google-cla

google-cla Bot commented Apr 26, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.
